Abstract

This paper suggests a way to calculate the probability that strings of
consecutive numbers appear within a sequence of $n$ numbers randomly
drawn (without replacement) from the set of the first $N$ consecutive
numbers, with $N \geq n$.

An explicit derivation is carried out for the special case of
SuperEnalotto, currently the most famous lottery in Italy, with $N = 90$
and $n = 6$. It turns out that, on average, about one drawing in three
contains one or more strings of consecutive numbers.



1 Introduction 

Among the lotteries and gambling games endorsed by the Italian government,
SuperEnalotto is surely the most popular, even in neighboring countries
such as France, Austria and Switzerland. It was first introduced on
December 3, 1997, and since then it has provided some of the biggest
payoffs of all lotteries in the world [1].

The game is currently drawn three times a week. The player wins the
jackpot (which grows at each drawing, depending on the total number of
players each time and on the number of previous drawings without a winner)
if she matches the 6 numbers drawn out of 90 (from 1 to 90), regardless of
the order in which the numbers are drawn.

Concerning the winning odds, SuperEnalotto is considered to be one of the
most difficult gambling games in existence: winning the jackpot means
matching the drawn unordered collection of six numbers among all the
possible 6-combinations of 90 numbers. It is easy to see that there are
$\binom{90}{6}$ of these combinations, so the winning odds at every drawing
are 1 in 622,614,630.

To give an idea of how tiny these odds are, the odds of winning the
jackpot in the next drawing are comparable with the odds of being hit,
here on Earth, by a kilometre-sized asteroid on the very same day you play
the game! Obviously, this does not mean that no one will ever win the
jackpot: many people play the game, thus the expectation value of winnings
is not tiny at all.

Although the author is not an avid player of SuperEnalotto^1, his attention
was drawn by the apparently counter-intuitive high frequency with which
some peculiar number patterns appear within the six drawn numbers. In
particular, he noted an unexpectedly high occurrence^2 of consecutive
numbers, like those in the last drawings of 2009 reported in Table 1.

Among conspiracy-inclined people, and there are plenty of them among avid
players, this feature is read as the unequivocal sign that drawings are
intentionally biased by the lottery operator.

On the contrary, the author's first thought was to understand whether
the laws of probability can explain such apparently bizarre behavior.
Obviously, they can, and the following Section shows how.

2 Consecutive numbers probability 

First of all, let us define the consecutive numbers probability $C(n, N)$:
this is the probability of having one or more strings of consecutive
numbers (of any length, from strings of 2 consecutive numbers up to strings
of $n$ consecutive numbers) within a sequence of $n$ numbers randomly drawn
from the set $\{1, 2, 3, 4, \ldots, N\}$ of the first $N$ consecutive
numbers. For the sake of simplicity, the consecutive numbers probability is
explicitly derived here only for the special case $N = 90$ and $n = 6$,
namely the case of SuperEnalotto. Generalizing this result should not pose
particular difficulties.

In what follows every drawn sequence is thought of as written with its
numbers in increasing order (as in Table 1) and in the following format:

$$s_1 \; a_1 \; s_2 \; a_2 \; s_3 \; a_3 \; s_4 \; a_4 \; s_5 \; a_5 \; s_6 \; a_6 \; s_7, \qquad (1)$$

^1 Probably because he is pretty confident that he will never be hit by a
kilometre-sized asteroid in his whole life.

^2 Indeed, this is a personal impression, not an objective fact.

Table 1: SuperEnalotto drawings in the last three months of 2009. In
bold: strings of consecutive numbers.

where the $a_j$ are the drawn numbers (with $a_i < a_j$ if $i < j$), while
the $s_k$ are numbers which express the distance between two contiguous
$a_j$. Namely, $s_2 = (a_2 - a_1) - 1$, $s_3 = (a_3 - a_2) - 1$, and so on,
while $s_1 = a_1 - 1$ and $s_7 = 90 - a_6$. As a particular case, $s_1 = 0$
if $a_1 = 1$, and $s_7 = 0$ if $a_6 = 90$. An important arithmetical
relation then holds:

$$s_1 + s_2 + s_3 + s_4 + s_5 + s_6 + s_7 = 90 - 6 = 84, \qquad (2)$$

with $0 \leq s_k \leq 84$.

Equation (2) makes it possible to deal with consecutive numbers within the
drawn sequence: indeed, one has strings of consecutive numbers when at
least one of the values $\{s_2, s_3, s_4, s_5, s_6\}$ is equal to zero. The
values of $s_1$ and $s_7$ are irrelevant to the consecutive numbers issue.
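
The format of eq. (1) and the criterion above can be illustrated with a
short Python sketch. The function names `gaps` and `has_consecutive` and the
parameter `n_max` are hypothetical, introduced here only for illustration.

```python
def gaps(drawing, n_max=90):
    """Return the gap list (s_1, ..., s_7) of eq. (1) for a drawing."""
    a = sorted(drawing)
    if len(set(a)) != len(a) or a[0] < 1 or a[-1] > n_max:
        raise ValueError("drawing must contain distinct numbers in 1..n_max")
    s = [a[0] - 1]                                   # s_1 = a_1 - 1
    s += [hi - lo - 1 for lo, hi in zip(a, a[1:])]   # inner gaps s_2 .. s_6
    s.append(n_max - a[-1])                          # s_7 = 90 - a_6
    return s


def has_consecutive(drawing, n_max=90):
    """True when at least one inner gap (s_2 .. s_6) is zero."""
    return 0 in gaps(drawing, n_max)[1:-1]


example = [1, 3, 13, 14, 15, 87]
s = gaps(example)
print(s, sum(s), has_consecutive(example))
```

For the example sequence the gaps sum to 84, as required by eq. (2), and two
of the inner gaps are zero.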

Thus, in order to calculate the numerical value of $C(6, 90)$, one has to
count all the ways of writing the integer 84 as a sum of the numbers
$\{s_1, s_2, s_3, s_4, s_5, s_6, s_7\}$, with at least one of the values
$\{s_2, s_3, s_4, s_5, s_6\}$ equal to zero: in fact, $C(6, 90)$ is equal to
the ratio between this number of ways (let us call it $N_c$) and the number
of ways of writing 84 as the sum of the numbers
$\{s_1, s_2, s_3, s_4, s_5, s_6, s_7\}$ without any proviso on their values,
namely with $0 \leq s_k \leq 84$. Let us call this last number $N_t$.

This is a partition problem (more precisely, a problem of ordered sums, since
the position of each summand matters) and, like most partition problems, it
can be solved with the technique of generating functions.

Consider first the calculation of the denominator of the above probability
fraction, the number $N_t$.

Every $s_k$ ranges from 0 to 84; thus, let us expand the following
polynomial power:

$$\left(1 + x + x^2 + x^3 + \cdots + x^{82} + x^{83} + x^{84}\right)^7 = \left(\frac{x^{85} - 1}{x - 1}\right)^7. \qquad (3)$$

After the expansion, the coefficient of the term $x^{84}$ provides exactly
the number $N_t$: this coefficient counts the number of ways in which the
term $x^{84}$ can be obtained as a product of seven terms $x^l$ (one for
each $s_k$), and thus it also counts the ways in which the exponent 84 can
be obtained as a sum of the seven exponents $l$, with $0 \leq l \leq 84$.

Applying repeated differentiation to eq. (3) and a suitable normalization
(to cancel the numerical coefficient equal to $84!$ which originates from
the exponent of $x^{84}$ after repeated differentiation), a relatively
simple algorithm for calculating the coefficient of $x^{84}$ can be
obtained as follows:

$$N_t = \frac{1}{84!} \, \frac{d^{84}}{dx^{84}} \left[ \left(\frac{x^{85} - 1}{x - 1}\right)^7 \right]_{x=0} = 622{,}614{,}630. \qquad (4)$$



The numerical result in eq. (4) was obtained in less than a minute with the
free mathematics software Sage [3], running on a not-so-new laptop. The
reader will have recognized this figure: it is the total number of
6-combinations of the set of 90 numbers, namely $\binom{90}{6}$, as it
should be.
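
As a cross-check that does not rely on symbolic differentiation, the same
coefficient can be read off by multiplying the polynomial of eq. (3) out
directly. The following is a minimal Python sketch, not the Sage computation
used by the author; the helper names `poly_mul` and `poly_pow` are
hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

DEGREE = 84  # only coefficients up to x^84 are needed


def poly_mul(p, q, degree=DEGREE):
    """Multiply two coefficient lists, discarding terms above x^degree."""
    out = [0] * (degree + 1)
    for i, pi in enumerate(p):
        if pi == 0:
            continue
        for j, qj in enumerate(q):
            if i + j > degree:
                break
            out[i + j] += pi * qj
    return out


def poly_pow(p, exponent, degree=DEGREE):
    """Raise a coefficient list to a non-negative integer power."""
    result = [1] + [0] * degree
    for _ in range(exponent):
        result = poly_mul(result, p, degree)
    return result


geometric = [1] * (DEGREE + 1)          # 1 + x + x^2 + ... + x^84
n_t = poly_pow(geometric, 7)[DEGREE]    # coefficient of x^84 in eq. (3)
print(n_t, n_t == comb(90, 6))
```

Truncating every product at degree 84 keeps the work small, since higher
powers of $x$ can never contribute to the coefficient of $x^{84}$.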

The next step is to calculate the number $N_c$. This task is a bit more
involved, although it does not require anything new with respect to the
derivation of the number $N_t$. According to the format of the drawn
sequence introduced in eq. (1), one has consecutive numbers when at least
one of the five numbers $\{s_2, s_3, s_4, s_5, s_6\}$ is equal to zero.

Let us start with the simplest case, in which only one $s_k$ among
$\{s_2, s_3, s_4, s_5, s_6\}$ is equal to zero while the others are
different from zero. This case corresponds to the situation in which there
is only a single pair of consecutive numbers within the drawn sequence.

First of all, there are $\binom{5}{1} = 5$ cases in which one such $s_k$
may be equal to zero, since there are exactly five numbers $s_k$. Moreover,
for each of these cases one must count the ways in which relation (2) is
fulfilled with one $s_k$ equal to zero and the others taking values
$1 \leq s_k \leq 84$.

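The arguments introduced above suggest that this number of ways is equal to
the coefficient of the term $x^{84}$ in the expansion of the following
product of polynomial powers:

$$\left(1 + x + x^2 + \cdots + x^{83} + x^{84}\right)^2 \left(x + x^2 + \cdots + x^{83} + x^{84}\right)^4 = \left(\frac{x^{85} - 1}{x - 1}\right)^2 \left(\frac{x^{85} - 1}{x - 1} - 1\right)^4, \qquad (5)$$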

where the left-hand factor on both sides (the one with exponent 2) accounts
for the terms $s_1$ and $s_7$ (which can also be equal to zero), while the
right-hand factor on both sides (the one with exponent 4) accounts for the
four terms among $\{s_2, s_3, s_4, s_5, s_6\}$ which are never equal to
zero. The one term $s_k$ which is zero counts as a factor equal to 1 in the
product (5).

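Hence, the total number of ways $N_1$ in which there can be only a single
pair of consecutive numbers within the drawn sequence is given by:

$$N_1 = \frac{1}{84!} \binom{5}{1} \frac{d^{84}}{dx^{84}} \left[ \left(\frac{x^{85} - 1}{x - 1}\right)^2 \left(\frac{x^{85} - 1}{x - 1} - 1\right)^4 \right]_{x=0} = 164{,}007{,}585. \qquad (6)$$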



Again, the factor $1/84!$ is the normalization factor needed to cancel the
'spurious' numerical coefficient which originates from the exponent of
$x^{84}$ after repeated differentiation.
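
The value in eq. (6) can also be checked by counting the admissible gap
values directly. The sketch below is an independent check rather than the
author's method: the recursive function `ways` is a hypothetical helper, and
the closed form $5 \binom{85}{5}$ used for comparison is an assumption that
follows from subtracting 1 from each of the four positive gaps, which leaves
six non-negative integers summing to 80.

```python
from functools import lru_cache
from math import comb

TOTAL = 84  # right-hand side of eq. (2)


@lru_cache(maxsize=None)
def ways(total, free_parts, positive_parts):
    """Count ordered sums equal to `total` with the given number of
    parts >= 0 (free) and parts >= 1 (positive)."""
    if free_parts == 0 and positive_parts == 0:
        return 1 if total == 0 else 0
    if free_parts > 0:
        return sum(ways(total - v, free_parts - 1, positive_parts)
                   for v in range(0, total + 1))
    return sum(ways(total - v, 0, positive_parts - 1)
               for v in range(1, total + 1))


# One inner gap is zero (5 choices); s_1 and s_7 are free; four gaps are >= 1.
n_1 = comb(5, 1) * ways(TOTAL, 2, 4)
print(n_1, n_1 == 5 * comb(85, 5))
```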

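It should now be evident how to calculate the number of possible drawn
sequences with consecutive numbers originating from two, three, four or all
five numbers $s_k$ being equal to zero. Let us call $N_2$, $N_3$, $N_4$,
$N_5$ the numbers of sequences with respectively two, three, four and all
five numbers $s_k$ equal to zero. Applying the same reasoning as for
eq. (6), one has:

$$N_2 = \frac{1}{84!} \binom{5}{2} \frac{d^{84}}{dx^{84}} \left[ \left(\frac{x^{85} - 1}{x - 1}\right)^2 \left(\frac{x^{85} - 1}{x - 1} - 1\right)^3 \right]_{x=0} = 20{,}247{,}850, \qquad (7)$$

$$N_3 = \frac{1}{84!} \binom{5}{3} \frac{d^{84}}{dx^{84}} \left[ \left(\frac{x^{85} - 1}{x - 1}\right)^2 \left(\frac{x^{85} - 1}{x - 1} - 1\right)^2 \right]_{x=0} = 987{,}700, \qquad (8)$$

$$N_4 = \frac{1}{84!} \binom{5}{4} \frac{d^{84}}{dx^{84}} \left[ \left(\frac{x^{85} - 1}{x - 1}\right)^2 \left(\frac{x^{85} - 1}{x - 1} - 1\right) \right]_{x=0} = 17{,}850, \qquad (9)$$

$$N_5 = \frac{1}{84!} \binom{5}{5} \frac{d^{84}}{dx^{84}} \left[ \left(\frac{x^{85} - 1}{x - 1}\right)^2 \right]_{x=0} = 85. \qquad (10)$$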



The numbers $N_i$ thus count the sequences containing $i$ pairs of
consecutive numbers (for example, in the sequence {1, 3, 13, 14, 15, 87},
two pairs of consecutive numbers are present, {13, 14} and {14, 15}, and
hence this sequence is counted in $N_2$).

Summarizing, the sought number $N_c$ is thus equal to the sum:

$$N_c = N_1 + N_2 + N_3 + N_4 + N_5 = 185{,}261{,}070, \qquad (11)$$

and the sought consecutive numbers probability is then:

$$C(6, 90) = \frac{N_c}{N_t} = \frac{185{,}261{,}070}{622{,}614{,}630} \approx 29.75\%. \qquad (12)$$
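
The whole chain of counts can be reproduced in a few lines of Python. The
closed form used in `n_pairs` is not taken from the derivation above: it is
an assumption obtained by subtracting 1 from every positive inner gap, so
that the count reduces to a binomial coefficient. The function name and its
parameters are hypothetical.

```python
from fractions import Fraction
from math import comb

N, n = 90, 6   # SuperEnalotto: 6 numbers out of 90


def n_pairs(i, N=N, n=n):
    """Sequences with exactly i zero inner gaps (closed form, see lead-in)."""
    return comb(n - 1, i) * comb(N - n + 1, n - i)


counts = {i: n_pairs(i) for i in range(1, n)}
n_c = sum(counts.values())
n_t = comb(N, n)
prob = Fraction(n_c, n_t)

for i, value in counts.items():
    print(f"N_{i} = {value:,}")
print(f"N_c = {n_c:,}  C(6, 90) = {float(prob):.4%}")

# Cross-check through the complement: drawings with no zero inner gap.
assert n_t - n_c == comb(N - n + 1, n)
```

The final assertion checks the result from the opposite side: the drawings
without any consecutive numbers are those in which all five inner gaps are
positive.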

This outcome says that, on average, nearly one drawing in three exhibits
the consecutive numbers phenomenon; it is definitely quite a common result.
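
A quick Monte Carlo experiment offers an independent sanity check of this
figure. The sketch below assumes that Python's pseudo-random generator is an
adequate stand-in for a fair drawing; the function names, the seed and the
number of trials are arbitrary choices made for illustration.

```python
import random


def has_consecutive(drawing):
    """True when two of the drawn numbers differ by exactly 1."""
    a = sorted(drawing)
    return any(hi - lo == 1 for lo, hi in zip(a, a[1:]))


def simulate(trials, seed=2009):
    """Fraction of simulated drawings (6 out of 90) with consecutive numbers."""
    rng = random.Random(seed)
    numbers = range(1, 91)
    hits = sum(has_consecutive(rng.sample(numbers, 6)) for _ in range(trials))
    return hits / trials


print(f"{simulate(200_000):.4f}")
```

With 200,000 simulated drawings the statistical error on the estimated
frequency is of the order of 0.1%, so the estimate should fall close to the
value of eq. (12).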

A direct comparison between expected and observed consecutive numbers
frequencies can easily be made by taking the data in Table 1 as a
statistical sample. The observed frequency amounts to about 32%, which is
slightly higher than that predicted by eq. (12). This is probably due to
the small size of the statistical sample taken into account.

According to the official statistics [4] on all the drawings up to now
(spanning from the last days of 1997 to the whole of 2009), the number of
drawings with consecutive numbers is equal to 454, while the total number
of drawings amounts to 1507; hence, the observed frequency calculated on
the widest possible statistical sample turns out to be
$454/1507 \approx 30.1\%$. This figure is reasonably close to that
predicted by eq. (12).

Another useful result that can easily be derived is the probability
$P(n, M)$ that, among a total of $M$ drawings, $n$ drawings contain
consecutive numbers. It is not difficult to see that $P(n, M)$ can be
expressed as a straightforward binomial distribution with success
probability equal to $C(6, 90)$:

$$P(n, M) = \binom{M}{n} \, C(6, 90)^n \, \left(1 - C(6, 90)\right)^{M - n}, \qquad (13)$$

with mean $\mu$ and standard deviation $\sigma$ equal to:

$$\mu = M \cdot C(6, 90), \qquad \sigma = \sqrt{M \cdot C(6, 90) \cdot \left(1 - C(6, 90)\right)}.$$
Category    Observed occurrences [4]    Expected occurrences (eq. (13)), μ ± σ
N_1         396                         ~ 397 ± 17
N_2         53                          ~ 49 ± 7
N_3         5                           ~ 2 ± 2
N_4         0                           ~ 0
N_5         0                           ~ 0
N_c         454                         448 ± 18

Table 2: Detailed comparison between observed [4] and expected (eq. (13))
occurrences of strings of consecutive numbers. As explained in Section 2,
N_i stands for the number of sequences with i pairs of consecutive numbers
(e.g. a sequence like {1, 3, 13, 14, 15, 87} has two pairs of consecutive
numbers, {13, 14} and {14, 15}). The observed occurrences are counted among
all the 1507 drawings from December 7, 1997, up to the whole of 2009. The
expected occurrences are calculated with the mean and the standard
deviation of the binomial distribution (eq. (13)), with success probability
equal to $N_i / N_t$.



In the case of $M = 1507$ (namely the number of all the SuperEnalotto
drawings to date) the mean $\mu$ of the distribution (13) is equal to 448
and the standard deviation $\sigma$ is nearly equal to 18. Thus, the
observed number of drawings with strings of consecutive numbers, 454, is
well within the expected range 448 ± 18. Table 2 also contains a detailed
comparison between observed [4] and expected (eq. (13)) occurrences of
strings of consecutive numbers for each category $N_1$, $N_2$, $N_3$,
$N_4$, $N_5$.
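
The expected occurrences quoted above follow from the binomial mean and
standard deviation. The sketch below recomputes them for some of the counts
given in Section 2; the function name `expected` and the dictionary layout
are hypothetical, and the success probability of each category is assumed
to be its count divided by $\binom{90}{6}$.

```python
from math import comb, sqrt

M = 1507                       # drawings considered in Table 2
N_T = comb(90, 6)
COUNTS = {                     # numbers of sequences taken from Section 2
    "N_1": 164_007_585,
    "N_3": 987_700,
    "N_5": 85,
    "N_c": 185_261_070,
}


def expected(count, drawings=M, total=N_T):
    """Mean and standard deviation of a binomial with p = count / total."""
    p = count / total
    return drawings * p, sqrt(drawings * p * (1 - p))


for name, count in COUNTS.items():
    mu, sigma = expected(count)
    print(f"{name}: {mu:.1f} ± {sigma:.1f}")
```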

3 Concluding Remarks 

In this paper a way has been suggested for calculating the probability that
strings of consecutive numbers appear within a sequence of $n$ numbers
randomly drawn (without replacement) from the set of the first $N$
consecutive numbers, with $N \geq n$.

An explicit derivation has been carried out for the special case of
SuperEnalotto, currently the most famous lottery in Italy, with $N = 90$
and $n = 6$. It turned out that, on average, about one drawing in three
contains one or more strings of consecutive numbers: a posteriori, this is
obviously not surprising; a priori, on the other hand, it admittedly
appears so, at least to the author.

One reasonably expects that if the ratio $n/N$ increases, then the
consecutive numbers probability will become higher. In the limit, it is not
difficult to prove that if $n > N/2$, for $N$ even, or if $n > (N+1)/2$,
for $N$ odd, the probability is equal to 1. This comes from a direct
application of the Pigeonhole Principle.
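
For small values of $N$ the threshold can be verified by brute force. The
sketch below searches for the largest subset of $\{1, \ldots, N\}$ without
consecutive numbers and compares its size with the bound stated above; the
function name `largest_gap_free_subset` is hypothetical and the range of
$N$ is kept small because the search is exhaustive.

```python
from itertools import combinations


def largest_gap_free_subset(N):
    """Largest n for which some n-subset of {1..N} has no consecutive pair."""
    best = 0
    for n in range(1, N + 1):
        found = any(
            all(b - a > 1 for a, b in zip(subset, subset[1:]))
            for subset in combinations(range(1, N + 1), n)
        )
        if not found:
            break
        best = n
    return best


for N in range(2, 13):
    threshold = N // 2 if N % 2 == 0 else (N + 1) // 2
    assert largest_gap_free_subset(N) == threshold
print("for N = 2..12, any n above the threshold forces consecutive numbers")
```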

One may also think of using the results obtained in Section 2, along with
some statistical goodness-of-fit tests to quantitatively compare observed
and expected occurrences, as a quality control on the lottery operator's
activity. Similar checks are now common for detecting anomalies in
accounting data and tax fraud, e.g. using the first-digit distribution
known as Benford's law [5].
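
As an illustration of such a check, a Pearson chi-square test can be applied
to the observed counts of Table 2. This is only a sketch under stated
assumptions: the choice of three classes (no consecutive numbers, exactly
one pair, two or more pairs) is made here to keep every expected count
reasonably large, and it is not a procedure taken from the paper.

```python
from math import exp

N_T = 622_614_630            # all 6-combinations, eq. (4)
N_1 = 164_007_585            # exactly one zero inner gap, eq. (6)
N_C = 185_261_070            # at least one zero inner gap, eq. (11)
M = 1507                     # drawings in the official statistics

# Observed counts from Table 2; the last class merges N_2 .. N_5.
observed = {"none": M - 454, "one pair": 396, "two or more": 454 - 396}
probability = {
    "none": 1 - N_C / N_T,
    "one pair": N_1 / N_T,
    "two or more": (N_C - N_1) / N_T,
}

chi2 = sum(
    (observed[k] - M * probability[k]) ** 2 / (M * probability[k])
    for k in observed
)
# With 3 classes there are 2 degrees of freedom, for which the chi-square
# survival function has the closed form exp(-x / 2).
p_value = exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")
```

A p-value well above the usual significance levels would mean that the
observed counts give no evidence against unbiased drawings.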

Artificial, man-made number sequences usually show a lower occurrence of
strings of consecutive numbers: people commonly tend to severely
underestimate the presence of such strings within randomly generated
number sequences.